Abstract. We study the syntactic complexity of finite/cofinite, definite,
and reverse definite languages. The syntactic complexity of a class of
languages is defined as the maximal size of syntactic semigroups of
languages from the class, taken as a function of the state complexity $n$ of
the languages. We prove that $(n-1)!$ is a tight upper bound for
finite/cofinite languages and that it can be reached only if the alphabet
size is greater than or equal to $(n-1)! - (n-2)!$. We prove that the
bound is also $(n-1)!$ for reverse definite languages, but the alphabet
size is $(n-1)! - 2(n-2)!$. We show that $\lfloor e \cdot (n-1)! \rfloor$ is a
lower bound on the syntactic complexity of definite languages, and conjecture
that this is also an upper bound, and that the alphabet size required to meet
this bound is $\lfloor e \cdot (n-1)! \rfloor - \lfloor e \cdot (n-2)! \rfloor$.
We prove the conjecture for $n \le 4$.

Keywords: definite, finite automaton, finite/cofinite, regular language,
reverse definite, syntactic complexity, syntactic semigroup

1 Introduction

A language is definite if it can be decided whether a word $w$ belongs to it
simply by examining the suffix of $w$ of some fixed length. The class of
definite languages was the very first subclass of regular languages to be
considered: it was introduced in 1954 in the classic paper by Kleene [10]. It
was then studied in 1963 by Perles, Rabin, and Shamir [15], and Brzozowski [2],
in 1966 by Ginzburg [8], and later by several others. Definite languages were
revisited in 2009 by Bordihn, Holzer and Kutrib [1] in connection with state
complexity. Reverse definite languages were first studied by Brzozowski [2].
Here membership of $w$ can be determined by its prefix of some fixed length.
The class of finite and cofinite languages is the intersection of the definite
and reverse definite classes. Here testing for membership can be done by
checking all words shorter than some fixed length. These three classes appear
at the bottom of the dot-depth hierarchy [5] of star-free languages, below
generalized definite languages and locally testable languages. All three
classes are boolean algebras. The syntactic semigroup $S$ of a finite/cofinite
language is nilpotent: it has a single idempotent $e$ which is a zero, and is
characterized by the equations $eS = Se = e$. For definite (reverse definite)
languages every idempotent $e$ is a right zero, that is, $Se = e$
(respectively, a left zero, that is, $eS = e$).

We study the sizes of syntactic semigroups of finite/cofinite, definite, and
reverse definite languages. If $L \subseteq \Sigma^*$ is a regular language
over alphabet $\Sigma$, its syntactic semigroup is defined by the Myhill
congruence [14] $\approx_L$: For $x, y \in \Sigma^*$,

$$x \approx_L y \text{ if and only if } uxv \in L \Leftrightarrow uyv \in L \text{ for all } u, v \in \Sigma^*.$$

The set $\Sigma^+ / \approx_L$ of equivalence classes of the relation
$\approx_L$ is the syntactic semigroup of $L$. It is well-known that this
semigroup is isomorphic to the semigroup $T_L$ of transformations performed by
non-empty words in the minimal deterministic finite automaton (DFA)
recognizing $L$ [13], and it is usually convenient to deal with the latter
semigroup. The transformation semigroup of the minimal DFA of $L$ is identical
to that of the minimal DFA of $\overline{L}$, the complement of $L$, because
the two automata differ only in their sets of final states.

The syntactic complexity $\sigma(L)$ of a language $L$ is the size of its
syntactic semigroup, and $\sigma(L) = |T_L|$, where $|S|$ denotes the
cardinality of a set $S$. Syntactic complexity can vary significantly among
languages with the same state complexity [6], where the state complexity of a
language is the number of states in its minimal DFA.

The observation that $n^n$ is a tight upper bound on the size of the
transformation semigroup of a DFA with $n$ states was first made by Maslov
[12] in 1970, although this follows immediately from a 1935 result of Piccard
[16], who showed that three generators suffice to produce all transformations
of a set of $n$ elements. The interest in syntactic complexity of subclasses
of regular languages is new. In 2003-2004 Holzer and König [9], and Krawetz,
Lawrence and Shallit [11] studied unary and binary languages. In 2011
Brzozowski and Ye [6] showed the following bounds: right ideals, tight upper
bound $n^{n-1}$; left ideals, lower bound $n^{n-1} + n - 1$; two-sided ideals,
lower bound $n^{n-2} + (n-2)2^{n-2} + 1$. In 2012 Brzozowski, Li and Ye [4]
found the following bounds: prefix-free languages, tight upper bound
$n^{n-2}$; suffix-free languages, lower bound $(n-1)^{n-2} + n - 2$;
bifix-free languages, lower bound
$(n-1)^{n-3} + (n-2)^{n-3} + (n-3)2^{n-3}$; factor-free languages, lower bound
$(n-1)^{n-3} + (n-3)2^{n-3} + 1$. Also in 2012 tight upper bounds were found
for three subclasses of star-free languages by Brzozowski and Li [3]:
monotonic languages, $C^{2n-1}_n$; partially monotonic languages,
$f(n) = \sum_{k=0}^{n-1} C^{n-1}_k C^{n+k-2}_k$; nearly monotonic languages,
$f(n) + n - 1$, where $C^i_j$ is the binomial coefficient "$i$ choose $j$". It
was conjectured in [3] that the bound for nearly monotonic languages is also a
tight upper bound for star-free languages. That bound is asymptotically
$2^{-3/4}(\sqrt{2}+1)^{2n-1}/\sqrt{\pi(n-1)}$.

We prove that $(n-1)!$ is a tight upper bound for finite/cofinite languages,
and that a growing alphabet of size at least $(n-1)! - (n-2)!$ is required to
reach the bound. For reverse definite languages the bound is also $(n-1)!$, but
the alphabet size is now $(n-1)! - 2(n-2)!$. We show that
$\lfloor e \cdot (n-1)! \rfloor$ is a lower bound for definite languages, and
that it can be reached with an alphabet of size
$\lfloor e \cdot (n-1)! \rfloor - \lfloor e \cdot (n-2)! \rfloor$. We
conjecture that this is also an upper bound, and prove the conjecture for
$n \le 4$.
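As a quick numerical illustration of the bounds just stated, the following sketch tabulates them for small $n$. The helper names `floor_e_times_factorial` and `bounds` are hypothetical and not part of the paper. The floor is computed exactly as the integer sum of $m!/j!$ for $j = 0, \dots, m$, which equals $\lfloor e \cdot m! \rfloor$ for $m \ge 1$, so no floating-point rounding is involved.

```python
from math import factorial

def floor_e_times_factorial(m):
    """Exact floor(e * m!) for m >= 1, as the sum of m!/j! for j = 0..m."""
    return sum(factorial(m) // factorial(j) for j in range(m + 1))

def bounds(n):
    """Bounds quoted in the text for state complexity n (illustrative, n >= 4)."""
    return {
        "finite_cofinite": factorial(n - 1),
        "finite_cofinite_alphabet": factorial(n - 1) - factorial(n - 2),
        "reverse_definite": factorial(n - 1),
        "reverse_definite_alphabet": factorial(n - 1) - 2 * factorial(n - 2),
        "definite_lower": floor_e_times_factorial(n - 1),
        "definite_alphabet": floor_e_times_factorial(n - 1)
                             - floor_e_times_factorial(n - 2),
    }

for n in range(4, 8):
    print(n, bounds(n))
```

For $n = 4$ this gives 6 for both factorial bounds, alphabet sizes 4 and 2, and the values 16 and 11 for definite languages.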



There is a lack of left-right symmetry in several results for syntactic com- 
plexity in spite of the fact that the syntactic congruence is symmetric. Thus, in 
the case of ideals [6], it was easy to find a tight upper bound for right ideals, 
but no tight upper bound is known for left ideals. It was easy to find a tight 
upper bound for prefix-free languages, but no tight upper bound is known for 
suffix-free languages [4]. This happens again here. We have a tight upper bound 
for reverse definite languages, but no tight upper bound for definite languages. 

Section 2 contains some preliminary material. Sections 3-5 discuss the syn- 
tactic complexity of finite/cofinite, reverse definite, and definite languages, re- 
spectively, and Section 6 concludes the paper. 



2 Preliminaries 

A transformation of a set $Q$ is a mapping of $Q$ into itself. We consider only
transformations of finite sets, and assume without loss of generality that
$Q = \{1, 2, \dots, n\}$. If $t$ is a transformation of $Q$, and $i \in Q$,
then $it$ is the image of $i$ under $t$. An arbitrary transformation can be
written in the form

$$t = \begin{pmatrix} 1 & 2 & \cdots & n-1 & n \\ i_1 & i_2 & \cdots & i_{n-1} & i_n \end{pmatrix},$$

where $i_k = kt$, $1 \le k \le n$, and $i_k \in Q$. We also use the notation
$t = [i_1, i_2, \dots, i_n]$ for the transformation $t$ above.

If $X$ is a subset of $Q$, then $Xt = \{it \mid i \in X\}$, and the restriction
of $t$ to $X$, denoted by $t|_X$, is a mapping from $X$ to $Xt$ such that
$it|_X = it$ for all $i \in X$.

A permutation of $Q$ is a mapping of $Q$ onto itself. A transformation $t$ is
permutational if there exists some $X \subseteq Q$ with $|X| \ge 2$ such that
$t|_X$ is a permutation of $X$. Otherwise, $t$ is non-permutational.

A constant transformation, denoted by $\binom{Q}{j}$, has $it = j$ for all $i$.

The composition of two transformations $t_1$ and $t_2$ of $Q$ is a
transformation $t_1 \circ t_2$ such that $i(t_1 \circ t_2) = (it_1)t_2$ for all
$i \in Q$. We usually omit the composition operator.
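The definitions above translate directly into code. The sketch below is only an illustration under one assumed encoding: a transformation $[i_1, \dots, i_n]$ is stored as a Python tuple, and the function names `compose` and `is_permutational` are hypothetical. The permutational test uses the fact that $t$ permutes some set $X$ with $|X| \ge 2$ exactly when at least two states lie on cycles of $t$, and the states on cycles are the image of $t^n$.

```python
def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def is_permutational(t):
    """True if t restricted to some X with |X| >= 2 is a permutation of X."""
    n = len(t)
    image = set(range(1, n + 1))
    # After n applications of t only the states lying on cycles remain.
    for _ in range(n):
        image = {t[i - 1] for i in image}
    return len(image) >= 2

s = (2, 3, 3)
t = (1, 1, 2)
print(compose(s, t), is_permutational(compose(s, t)))
```

With this encoding a non-permutational transformation is one whose powers eventually become constant.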

A deterministic finite automaton (DFA) is a quintuple
$\mathcal{D} = (Q, \Sigma, \delta, q_1, F)$, where $Q$ is a finite, non-empty
set of states, $\Sigma$ is a finite non-empty alphabet,
$\delta : Q \times \Sigma \to Q$ is the transition function, $q_1 \in Q$ is the
initial state, and $F \subseteq Q$ is the set of final states. We extend
$\delta$ to $Q \times \Sigma^*$ in the usual way. The DFA $\mathcal{D}$ accepts
a word $w \in \Sigma^*$ if $\delta(q_1, w) \in F$. The set of all words
accepted by $\mathcal{D}$ is the language $L(\mathcal{D})$ of $\mathcal{D}$.
Two states of a DFA are distinguishable if there exists a word $w$ which is
accepted from one of the states and rejected from the other. Otherwise, the
two states are equivalent. A DFA is minimal if all of its states are reachable
from the initial state and no two states are equivalent. All the minimal DFAs
of a given language $L$ are isomorphic.

The notion of a DFA connects transformations to regular languages. Given a
regular language $L$, its minimal DFA
$\mathcal{D} = (Q, \Sigma, \delta, q_1, F)$, and a word $w \in \Sigma^+$, the
transition function $\delta(\cdot, w)$ is a transformation of $Q$, the
transformation caused by $w$. When convenient, we identify a word with its
corresponding transformation.
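Since every non-empty word causes the composition of the transformations of its letters, the transformation semigroup of a DFA is the closure of the letter transformations under composition. The following sketch computes that closure. The tuple encoding and the names `compose` and `generated_semigroup` are assumptions made for illustration, not notation from the paper.

```python
def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def generated_semigroup(generators):
    """All transformations caused by non-empty words over the given letters."""
    gens = [tuple(g) for g in generators]
    elements = set(gens)
    frontier = list(gens)
    while frontier:
        s = frontier.pop()
        # Extending a word by one letter on the right reaches every product.
        for g in gens:
            p = compose(s, g)
            if p not in elements:
                elements.add(p)
                frontier.append(p)
    return elements

# A hypothetical 3-state DFA over a one-letter alphabet {a}, with a = [2, 3, 3].
print(sorted(generated_semigroup([(2, 3, 3)])))
```

The number of elements returned is the size of the transformation semigroup, which equals the syntactic complexity when the DFA is minimal.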

The (left) quotient of a language $L \subseteq \Sigma^*$ by a word
$w \in \Sigma^*$ is the language $L_w = \{x \mid wx \in L\}$. Note that
$L_\varepsilon = L$, where $\varepsilon$ is the empty word. The quotient DFA of
a regular language $L$ is $\mathcal{D} = (Q, \Sigma, \delta, q_1, F)$, where
$Q = \{L_w \mid w \in \Sigma^*\}$, $\delta(L_w, a) = L_{wa}$,
$q_1 = L_\varepsilon = L$, and $F = \{L_w \mid \varepsilon \in L_w\}$. The
quotient DFA is isomorphic to the minimal DFA accepting $L$.

3 Finite/Cofinite Languages 

One of the simplest classes of regular languages is the class of finite and cofinite 
languages, where a language is cofinite if its complement is finite. Since the 
syntactic complexity bounds for finite and cofinite languages are identical, we 
restrict our analysis here to finite languages. 

Let $L$ be a regular language and $\mathcal{D} = (Q, \Sigma, \delta, q_1, F)$
be its minimal DFA. It is well-known that $L$ is finite/cofinite if and only if
there exists a numbering $1, \dots, n$ on $Q$ so that for all
$w \in \Sigma^+$, $\delta(i, w) = j$ implies that $i < j$ or $i = j = n$. We
define the set $A_n$ of transformations on $\{1, 2, \dots, n\}$ with these
properties:

$$A_n = \{t \mid it > i \ \forall i = 1, \dots, n-1, \text{ and } nt = n\}.$$

There are $n - i$ choices for $it$ when $i < n$, so $|A_n| = (n-1)!$, and it is
clear that $A_n$ is a semigroup under composition.
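The size and closure of $A_n$ can be checked by brute force for small $n$. The sketch below assumes transformations are encoded as tuples; the function names `A` and `compose` are hypothetical.

```python
from itertools import product
from math import factorial

def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def A(n):
    """The set A_n: it > i for i = 1..n-1, and nt = n."""
    ranges = [range(i + 1, n + 1) for i in range(1, n)]
    return {head + (n,) for head in product(*ranges)}

for n in range(1, 7):
    An = A(n)
    closed = all(compose(s, t) in An for s in An for t in An)
    print(n, len(An), factorial(n - 1), closed)
```

Each line of output lists $n$, $|A_n|$, $(n-1)!$, and whether $A_n$ is closed under composition.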

Theorem 1. Let $L$ be a finite or cofinite language with state complexity $n$.
Then the syntactic complexity of $L$ satisfies $\sigma(L) \le (n-1)!$ and this
bound is tight.

Proof. Let $\mathcal{D} = (Q, \Sigma, \delta, q_1, F)$ be the minimal DFA of
$L$. The above discussion implies that we may label the states $Q$ so that
$T_L$ is a subsemigroup of $A_n$. Therefore the bound holds.

Let $n > 1$ and $|\Sigma| = (n-1)!$. Let $\mathcal{D}$ be a DFA with states
numbered $\{1, 2, \dots, n\}$, initial state 1, sink state $n$, and a final
state $n-1$. For each transformation $t \in A_n$, assign a letter in $\Sigma$
whose input transformation on $\mathcal{D}$ is exactly $t$. To show that
$\mathcal{D}$ is minimal, note that state $i > 1$ is reached from the initial
state by the transformation $[i, n, n, \dots, n]$. Also, if $i$ and $j$ are two
states and $i < j < n$, then the transformation $t \in A_n$ that has
$it = n-1$, and $kt = n$ for all other $k \ne i$, distinguishes the two states.
Hence $\mathcal{D}$ is minimal and accepts a finite language. Therefore the
bound is tight. □

A natural question is the minimal size of the alphabet required to achieve the
upper bound. Let $\mathcal{D}$ be the minimal DFA of a finite language $L$ with
$T_L = A_n$. For any state $i \in Q$ and $a \in \Sigma$, it is clear that
$\delta(i, a) \ge i + 1$ or $i = n$. It follows that if an input transformation
$t \in A_n$ satisfies $it = i + 1$ for some $i \in \{1, 2, \dots, n-2\}$, then
any word $w$ corresponding to $t$ must have length 1, that is, $w$ must be in
$\Sigma$.



Theorem 2. Let $L \subseteq \Sigma^*$ be a finite or cofinite language with
state complexity $n \ge 3$, and suppose that $\sigma(L) = (n-1)!$. Then

$$|\Sigma| \ge (n-1)! - (n-2)!$$

and this bound is tight.

Proof. By Theorem 1, we may assume that $T_L = A_n$. The preceding discussion
implies that $|\Sigma|$ is at least the number of transformations which satisfy
$it = i + 1$ for some $i = 1, \dots, n-2$. Let $G_n \subseteq A_n$ be the set
of these transformations. If we place the restriction $it \ne i + 1$ for all
$i \in \{1, 2, \dots, n-2\}$ then there are $n - i - 1$ choices for each of
these $it$, and hence a total of $(n-2)!$ such transformations. Therefore
$|G_n| = |A_n| - (n-2)! = (n-1)! - (n-2)!$. Now let
$t = [j_1, \dots, j_{n-2}, n, n] \in A_n$ be arbitrary. Let

$$k = \min_{1 \le i \le n-2} \{j_i - i\} - 1,$$

and $t' = [j_1 - k, \dots, j_{n-2} - k, n, n]$. Then $t' \in G_n$ and
$t = t'[2, 3, \dots, n-1, n, n]^k$. Thus $G_n$ generates $A_n$, and the bound
is tight. □
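The counting and generation claims in this proof can be confirmed by exhaustive search for small $n$. The sketch below is a brute-force check, not the proof's argument; the tuple encoding and the names `A`, `G`, and `generated_semigroup` are assumptions for illustration.

```python
from itertools import product
from math import factorial

def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def A(n):
    """The set A_n: it > i for i = 1..n-1, and nt = n."""
    ranges = [range(i + 1, n + 1) for i in range(1, n)]
    return {head + (n,) for head in product(*ranges)}

def G(n):
    """Members of A_n with it = i + 1 for some i in 1..n-2."""
    return {t for t in A(n) if any(t[i - 1] == i + 1 for i in range(1, n - 1))}

def generated_semigroup(generators):
    """Closure of the generators under composition."""
    gens = list(generators)
    elements, frontier = set(gens), list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            p = compose(s, g)
            if p not in elements:
                elements.add(p)
                frontier.append(p)
    return elements

for n in range(3, 7):
    An, Gn = A(n), G(n)
    print(n, len(Gn), factorial(n - 1) - factorial(n - 2),
          generated_semigroup(Gn) == An)
```

The same search also shows why no letter can be dropped: a product of two members of $A_n$ never satisfies $it = i + 1$ for $i \le n - 2$, so no member of $G_n$ is such a product.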

Example 1. For $n = 4$, the largest semigroup is

$$A_4 = \{[2,3,4,4], [2,4,4,4], [3,3,4,4], [3,4,4,4], [4,3,4,4], [4,4,4,4]\},$$

and its minimal generating set is
$G_4 = \{[2,3,4,4], [2,4,4,4], [3,3,4,4], [4,3,4,4]\}$.

4 Reverse Definite Languages 

A reverse definite language is a language $L \subseteq \Sigma^*$ of the form
$L = E \cup F\Sigma^*$, where $E$ and $F$ are finite languages. Because reverse
definite languages are characterized by prefixes of a fixed length, their
minimal DFAs (and hence syntactic complexity bounds) are very similar to those
of finite/cofinite languages. If $L$ has state complexity 1, then either
$L = \emptyset$ or $L = \Sigma^*$. Since both these languages are in the
finite/cofinite class, the bound $(n-1)!$ of Theorem 1 applies. For state
complexities $n > 1$, we note first that if $\emptyset$ is not a quotient of
$L$, then $L$ is cofinite. Otherwise, $\emptyset$ and $\Sigma^*$ are both
quotients of $L$. Let $\mathcal{D} = (Q, \Sigma, \delta, q_1, F)$ be the
minimal DFA of $L$, and label the states corresponding to $\emptyset$ and
$\Sigma^*$ with $n-1$ and $n$, respectively. One can number the other states in
$Q$ so that for all words $w \in \Sigma^+$, if $\delta(i, w) = j$ then
$i \le j$ with equality if and only if $i \in \{n-1, n\}$. The syntactic
complexity results for reverse definite languages now follow directly from the
finite/cofinite results.

Theorem 3. Let $L = E \cup F\Sigma^*$ be a reverse definite language with state
complexity $n \ge 1$. Then $\sigma(L) \le (n-1)!$, and this bound is tight.
Moreover, if this language $L$ achieves this upper bound and $n \ge 4$, then
$|\Sigma| \ge (n-1)! - 2(n-2)!$, and this bound is tight.



Proof. First, if $\emptyset$ is not a quotient of $L$, then $L$ is cofinite and
hence has the same bounds as in the previous section. To find a cofinite
witness $L$ meeting the bound $(n-1)!$, first find a finite witness $L$ as in
the proofs of Theorems 1 and 2, and then interchange its final and non-final
states.

Otherwise, let $\mathcal{D}$ be the minimal DFA recognizing $L$, and let the
states be totally ordered as in the preceding discussion. Define the set of
transformations analogous to the finite case:

$$A'_n = \{t \mid it > i \ \forall i = 1, \dots, n-2,\ (n-1)t = n-1, \text{ and } nt = n\}.$$

Then $T_L \subseteq A'_n$, which a straightforward calculation shows to be a
semigroup. Clearly, $|A'_n| = (n-1)!$, thus proving the bound.

To find a witness for this case, start with the finite witness as in the proofs
of Theorems 1 and 2, make all transitions from state $n-1$ go to itself, and
make state $n$ the only final state.

For the minimal size of the alphabet, we define $G'_n \subseteq A'_n$ to be the
set of transformations $t$ in $A'_n$ satisfying $it = i + 1$ for some
$i = 1, \dots, n-3$. As in Section 3, these transformations must correspond to
individual letters in $\Sigma$, hence proving the bound. The same indirect
counting argument shows that for $n \ge 4$,
$|G'_n| = (n-1)! - 2 \cdot (n-2)!$. A similar argument also shows that $G'_n$
generates $A'_n$ (using the transformations $[2, 3, \dots, n-1, n-1, n]$ and
$[2, 3, \dots, n, n-1, n]$ in place of $[2, 3, \dots, n-1, n, n]$). Therefore
the alphabet size bound is tight. □
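The claims about $A'_n$ and $G'_n$ can likewise be checked exhaustively for small $n$. The sketch below is a brute-force check under an assumed tuple encoding; the names `A_prime`, `G_prime`, and `generated_semigroup` are hypothetical.

```python
from itertools import product
from math import factorial

def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def A_prime(n):
    """A'_n: it > i for i = 1..n-2, (n-1)t = n-1, and nt = n (n >= 2)."""
    ranges = [range(i + 1, n + 1) for i in range(1, n - 1)]
    return {head + (n - 1, n) for head in product(*ranges)}

def G_prime(n):
    """Members of A'_n with it = i + 1 for some i in 1..n-3."""
    return {t for t in A_prime(n)
            if any(t[i - 1] == i + 1 for i in range(1, n - 2))}

def generated_semigroup(generators):
    """Closure of the generators under composition."""
    gens = list(generators)
    elements, frontier = set(gens), list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            p = compose(s, g)
            if p not in elements:
                elements.add(p)
                frontier.append(p)
    return elements

for n in range(4, 7):
    Ap, Gp = A_prime(n), G_prime(n)
    print(n, len(Ap), len(Gp), factorial(n - 1) - 2 * factorial(n - 2),
          generated_semigroup(Gp) == Ap)
```

For $n = 4$ the search returns the two generators $[2,3,3,4]$ and $[2,4,3,4]$.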

Example 2. For $n = 4$, the finite witness meeting the bound $(n-1)!$ has the
transformation set given in Example 1. We modify this set by making $n-1$ the
sink state, thus obtaining

$$A'_4 = \{[2,3,3,4], [2,4,3,4], [3,3,3,4], [3,4,3,4], [4,3,3,4], [4,4,3,4]\},$$

where the generators are $[2,3,3,4]$ and $[2,4,3,4]$, and state 4 is final.

5 Definite Languages 

A definite language is a language $L \subseteq \Sigma^*$ of the form
$L = E \cup \Sigma^* F$, where $E$ and $F$ are finite languages. Like
finite/cofinite and reverse definite languages, definite languages are
characterized by their transformation semigroups. In this case, every
transformation of the minimal DFA of a definite language must be
non-permutational. Conversely, if the transformation semigroup of a minimal DFA
contains only non-permutational transformations, then it accepts a definite
language.

Our goal for this section is to find the maximal size of a non-permutational
transformation semigroup, that is, one which contains only non-permutational
transformations. There is a straightforward bijection between such
transformations on $\{1, \dots, n\}$ and simple labeled forests on $n - 1$
nodes. This can be seen by constructing the graph on $n$ nodes with edges
$(i, j)$ representing $it = j$, and then removing the unique node for which
$it = i$. Then Cayley's Theorem [7, 17] shows that there are $n^{n-1}$
non-permutational transformations of $\{1, \dots, n\}$.

Identifying non-permutational transformations is not sufficient to find a
syntactic complexity bound, as the set of such transformations does not form a
semigroup for $n \ge 3$. For example, the composition of $s = [2,3,3]$ and
$t = [1,1,2]$ is $st = [1,2,2]$, which is permutational. Two transformations
conflict if there exists a permutational transformation in the semigroup that
they generate.
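The count of non-permutational transformations and the notion of conflict can both be explored by direct enumeration. The sketch below assumes a tuple encoding of transformations; the names `non_permutational` and `conflict` are hypothetical helpers, and the enumeration is only practical for small $n$.

```python
from itertools import product

def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def is_permutational(t):
    """True if at least two states lie on cycles of t."""
    image = set(range(1, len(t) + 1))
    for _ in range(len(t)):
        image = {t[i - 1] for i in image}
    return len(image) >= 2

def non_permutational(n):
    """All non-permutational transformations of {1, ..., n}."""
    return [t for t in product(range(1, n + 1), repeat=n)
            if not is_permutational(t)]

def conflict(s, t):
    """True if the semigroup generated by s and t has a permutational element."""
    gens = [s, t]
    seen, frontier = set(gens), list(gens)
    while frontier:
        u = frontier.pop()
        for g in gens:
            p = compose(u, g)
            if p not in seen:
                seen.add(p)
                frontier.append(p)
    return any(is_permutational(u) for u in seen)

for n in range(1, 6):
    print(n, len(non_permutational(n)), n ** (n - 1))
print(conflict((2, 3, 3), (1, 1, 2)))
```

The first loop compares the enumerated count with $n^{n-1}$, and the last line tests the pair $s$, $t$ from the example above.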

We exhibit the following sets of non-permutational transformations which do
not conflict; they are similar to the semigroup $A_n$ from Section 3.

Theorem 4. Let $n > 1$, and define the following sets of transformations:

$$B_{n,k} = \{t \mid it > i \ \forall\, 1 \le i < k, \text{ and } it = k \ \forall\, i \ge k\}, \quad k = 1, 2, 3, \dots, n.$$

Then the set of transformations $B_n = \bigcup_{k=1}^{n} B_{n,k}$ is a maximal
non-permutational semigroup of size $\lfloor e \cdot (n-1)! \rfloor$.



Proof. One can check that each $B_{n,k}$ is a semigroup. Let $t_i \in B_{n,i}$
and $t_j \in B_{n,j}$, with $i < j$. A direct computation shows that
$t_i t_j \in B_{n, i t_j}$, and $t_j t_i \in B_{n,i}$; hence $B_n$ is a
semigroup. Moreover, for all $t \in B_{n,k}$, $t^{n-1} = \binom{Q}{k}$, and so
all of the transformations are non-permutational.

A simple counting argument shows that

$$|B_{n,k}| = (n-1)(n-2) \cdots (n-k+1) = \frac{(n-1)!}{(n-k)!}.$$

Since the $B_{n,k}$ are disjoint,

$$|B_n| = \sum_{k=1}^{n} \frac{(n-1)!}{(n-k)!} = \lfloor e \cdot (n-1)! \rfloor.$$

For the maximality of $B_n$, we show that adding any other non-permutational
transformation creates a conflict. Let $t \notin B_n$ be non-permutational,
with $it = i$.

First suppose that there exists a $j < i$ with $jt = k \le j$. Since $t$ is
non-permutational, we may assume $k < j$. Then there exists a
$t' \in B_{n,i}$ with $kt' = j$; then $itt' = i$ and $jtt' = j$, and so $t$
and $t'$ conflict.

If no such $j$ exists, then there must exist a $j > i$ with $jt \ne i$.
Consider the sequence defined by $j_0 = j$, $j_l = j_{l-1}t$. If there exists
an $l$ such that $j_l t = j_{l+1} < i$, let $l$ be the minimal one. Let
$t' \in B_{n,j_l}$, with $j_{l+1}t' = i$ and $it' = j_l$. Then $itt' = j_l$,
$j_l tt' = i$, and so $tt'$ is permutational. Now suppose all $j_l \ge i$.
Since $t$ is non-permutational, $i$ must appear in the sequence; moreover,
since $j_1 = jt \ne i$, we can pick $l \ge 0$ so that $i = j_{l+2}$. Since
$j_{l+1} > i$, we may find a transformation $t' \in B_{n,j_l}$ with
$it' = j_{l+1}$ and $j_{l+1}t' = j_l$. Then $it't = i$ and, for
$k = j_{l+1}$, $kt't = k$, and so $t't$ is permutational. □
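The statements of Theorem 4 can be checked by brute force for small $n$. The sketch below builds $B_n$ from the definition of $B_{n,k}$ and tests size, closure, and non-permutationality; the maximality test is exhaustive and therefore only run for very small $n$. The tuple encoding and all function names are assumptions for illustration.

```python
from itertools import product
from math import factorial

def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def is_permutational(t):
    """True if at least two states lie on cycles of t."""
    image = set(range(1, len(t) + 1))
    for _ in range(len(t)):
        image = {t[i - 1] for i in image}
    return len(image) >= 2

def B_nk(n, k):
    """B_{n,k}: it > i for i < k, and it = k for i >= k."""
    ranges = [range(i + 1, n + 1) for i in range(1, k)]
    return {head + (k,) * (n - k + 1) for head in product(*ranges)}

def B(n):
    return set().union(*(B_nk(n, k) for k in range(1, n + 1)))

def closure(generators):
    gens = list(generators)
    elements, frontier = set(gens), list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            p = compose(s, g)
            if p not in elements:
                elements.add(p)
                frontier.append(p)
    return elements

def is_maximal(n):
    """Every non-permutational t outside B_n conflicts with B_n."""
    Bn = B(n)
    others = [t for t in product(range(1, n + 1), repeat=n)
              if t not in Bn and not is_permutational(t)]
    return all(any(is_permutational(u) for u in closure(Bn | {t}))
               for t in others)

for n in range(2, 6):
    Bn = B(n)
    expected = sum(factorial(n - 1) // factorial(j) for j in range(n))
    print(n, len(Bn), expected,
          all(compose(s, t) in Bn for s in Bn for t in Bn))
print(is_maximal(3))
```

The expected size is computed as the exact integer sum of $(n-1)!/j!$, which equals $\lfloor e \cdot (n-1)! \rfloor$ for $n \ge 2$.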



To compute the generators of $B_n$, we require the following definition. Let
$C_n$ be the set of all transformations $t = [i_1, \dots, i_n] \in B_n$ with
all $i_j < n$. Define the function $\alpha : C_n \to B_n$ by
$\alpha(t) = [i_1 + 1, \dots, i_n + 1]$, and also

$$\alpha(C_n) = \{t \in B_n \mid \alpha(t_0) = t \text{ for some } t_0 \in C_n\}.$$

Clearly, $\alpha$ is a bijection between $C_n$ and $\alpha(C_n)$.

Theorem 5. Let $H_n = B_n \setminus \alpha(C_n)$. Then

(1) $H_n$ is the minimum set of generators for $B_n$.

(2) $|H_n| = \lfloor e \cdot (n-1)! \rfloor - \lfloor e \cdot (n-2)! \rfloor$.

Proof. For (1), note that $[2, 3, \dots, n, n] \in H_n$. For any $t \in B_n$,
we can write $t = t_0 [2, 3, \dots, n, n]^k$ with $k \ge 0$ and
$t_0 \in H_n$, as in the proof of Theorem 2. Therefore $H_n$ generates $B_n$.

Now let $t_i \in B_{n,i}$ and $t_j \in B_{n,j}$, with $i > j$. We consider
$m t_i t_j$, and use the fact that each transformation $t \in B_{n,k}$
satisfies $mt \ge \min\{k, m + 1\}$. There are two cases:

(a) If $m \ge j - 1$, then $m t_i \ge \min\{i, m + 1\} \ge j$, hence
$m t_i t_j = j$.

(b) If $m \le j - 2 < i$, then $m t_i \ge m + 1$, hence
$m t_i t_j \ge \min\{j, m t_i + 1\} \ge m + 2$.

It follows that $\alpha^{-1}(t_i t_j) \in B_{n,j-1}$; a similar argument shows
that $\alpha^{-1}(t_j t_i) \in B_{n, j t_i - 1}$. Consequently, no
transformation in $H_n$ is a composition of two others in $B_n$, and so $H_n$
is the minimum generating set of $B_n$.

For (2), we calculate $|\alpha(C_n)|$, or equivalently $|C_n|$ because $\alpha$
is a bijection. A counting argument shows that
$|B_{n,k} \cap C_n| = \frac{(n-2)!}{(n-2-(k-1))!}$. Therefore

$$|H_n| = |B_n \setminus \alpha(C_n)| = |B_n| - \sum_{k=1}^{n-1} \frac{(n-2)!}{(n-2-(k-1))!} = \lfloor e \cdot (n-1)! \rfloor - \lfloor e \cdot (n-2)! \rfloor.$$

□
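The generating set $H_n$ can be computed directly from its definition and checked against both parts of Theorem 5. The sketch below is a brute-force check under an assumed tuple encoding; the names `alpha_image`, `H`, and `closure` are hypothetical. The check starts at $n = 3$, since for $n = 2$ the expression $\lfloor e \cdot 0! \rfloor = 2$ no longer equals the sum it abbreviates.

```python
from itertools import product
from math import factorial

def compose(s, t):
    """Return st acting left to right, i(st) = (is)t, on states 1..n."""
    return tuple(t[j - 1] for j in s)

def B(n):
    """B_n as the union of B_{n,k}: it > i for i < k, it = k for i >= k."""
    result = set()
    for k in range(1, n + 1):
        ranges = [range(i + 1, n + 1) for i in range(1, k)]
        result |= {head + (k,) * (n - k + 1) for head in product(*ranges)}
    return result

def alpha_image(n):
    """alpha(C_n): add 1 to every entry of each t in B_n with all entries < n."""
    return {tuple(x + 1 for x in t) for t in B(n) if max(t) < n}

def H(n):
    return B(n) - alpha_image(n)

def closure(generators):
    gens = list(generators)
    elements, frontier = set(gens), list(gens)
    while frontier:
        s = frontier.pop()
        for g in gens:
            p = compose(s, g)
            if p not in elements:
                elements.add(p)
                frontier.append(p)
    return elements

def floor_e_fact(m):
    """Exact floor(e * m!) for m >= 1."""
    return sum(factorial(m) // factorial(j) for j in range(m + 1))

for n in range(3, 6):
    Hn = H(n)
    print(n, len(Hn), floor_e_fact(n - 1) - floor_e_fact(n - 2),
          closure(Hn) == B(n))
```

Removing any single element of $H_n$ and recomputing the closure gives a way to test, for small $n$, that no generator is redundant.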

The following corollary establishes a direct connection with definite lan- 
guages. 

Corollary 6 For all $n > 1$, there exists a definite language $L$ with state
complexity $n$, syntactic complexity
$\sigma(L) = \lfloor e \cdot (n-1)! \rfloor$, and alphabet size
$\lfloor e \cdot (n-1)! \rfloor - \lfloor e \cdot (n-2)! \rfloor$.

Proof Let $\mathcal{D} = (Q, \Sigma, \delta, q_1, F)$ be a DFA with
$Q = \{1, 2, \dots, n\}$, $q_1 = 1$, $F = \{n\}$, and
$|\Sigma| = \lfloor e \cdot (n-1)! \rfloor - \lfloor e \cdot (n-2)! \rfloor$
with each letter representing a different transformation in $H_n$, so that the
transformation semigroup of $\mathcal{D}$ is $B_n$. We claim that this is a
minimal DFA of a definite language. First, all the states are reachable by the
constant transformations $\binom{Q}{j} \in B_n$. Also, any two states $i, j$
with $i < j < n$ are distinguishable by the transformation $t \in B_n$ which
acts as $kt = k + 1$ for $1 \le k \le i$, and $kt = n$ for $k > i$. State $n$
is distinguishable from every other state because it is the only final state.
Hence $\mathcal{D}$ is minimal. Then by Theorem ??, $\mathcal{D}$ accepts a
definite language. □



Conjecture 7 Let $L$ be a definite language with state complexity $n > 1$. Then
$\sigma(L) \le \lfloor e \cdot (n-1)! \rfloor$, and if equality holds then
$|\Sigma| \ge \lfloor e \cdot (n-1)! \rfloor - \lfloor e \cdot (n-2)! \rfloor$.

Example 3. For $n = 4$ we have the following transformations in $B_n$:

$B_{4,1} = \{[1,1,1,1]\}$,

$B_{4,2} = \{[2,2,2,2], [3,2,2,2], [4,2,2,2]\}$,

$B_{4,3} = \{[2,3,3,3], [2,4,3,3], [3,3,3,3], [3,4,3,3], [4,3,3,3], [4,4,3,3]\}$,

$B_{4,4} = \{[2,3,4,4], [2,4,4,4], [3,3,4,4], [3,4,4,4], [4,3,4,4], [4,4,4,4]\}$.

The generators are the eleven transformations in
$H_4 = B_4 \setminus \alpha(C_4)$, that is, all of the above except
$[2,2,2,2]$, $[3,3,3,3]$, $[4,3,3,3]$, $[3,4,4,4]$, and $[4,4,4,4]$.

6 Conclusions and Future Work 

Though we have found tight upper bounds on the syntactic complexity of fi- 
nite/cofinite and reverse definite languages, we have only conjectured the bounds 
on the syntactic complexity and the corresponding alphabet size for definite
languages. The conjecture has been verified through computational enumeration
for $n \le 4$, but remains unproven for $n > 4$. Also, syntactic complexity
bounds have
yet to be found for the related higher classes in the dot-depth hierarchy of star- 
free languages, namely the generalized definite and locally testable languages. It 
is possible that the technique used in this paper — characterize allowable trans- 
formations in the syntactic semigroup and apply combinatorial arguments to 
count them — can be used to find bounds for these languages as well. 